Evaluating Branch Predictors on an SMT Processor
نویسندگان
چکیده
Simultaneous multithreading (SMT) provides significant increases in microprocessor throughput by issuing instructions from multiple threads per clock cycle. SMT can be realized in a wide-issue superscalar with a modest increase in resources, because much of the hardware is shared among the multiple thread contexts. Branch prediction accuracy, a key component of microprocessor performance, can suffer greatly from destructive interference caused by multiple threads sharing the same branch prediction hardware. Although SMT processors are able to hide branch latencies by scheduling instructions from other threads, branch prediction accuracy is still important in SMT processors to reduce latencies of individual threads and to provide a diverse mix of threads to the instruction scheduler. This paper evaluates several wellknown branch prediction schemes, and examines some modifications to address problems caused by sharing of branch prediction hardware in SMT.
منابع مشابه
CASH: Revisiting Hardware Sharing in Single-Chip Parallel Processors
As the increasing of issue width has diminishing returns with superscalar processor, thread parallelism with a single chip is becoming a reality. In the past few years, both SMT (Simultaneous MultiThreading) and CMP (Chip MultiProcessor) approaches were first investigated by academics and are now implemented by the industry. In some sense, CMP and SMT represent two extreme design points. In thi...
متن کاملAn Effective Bypass Mechanism to Enhance Branch Predictor for SMT Processors
Unlike traditional superscalar processors, Simultaneous Multithreaded processor can explore both instruction level parallelism and thread level parallelism at the same time. With a same fetch width, SMT fetches instructions from a single thread not so deeply as in traditional superscalar processor. Meanwhile, all the instructions from different threads share the same Function Unites in SMT. All...
متن کاملTolerating Branch Predictor Latency on SMT
Simultaneous Multithreading (SMT) tolerates latency by executing instructions from multiple threads. If a thread is stalled, resources can be used by other threads. However, fetch stall conditions caused by multi-cycle branch predictors prevent SMT to achieve all its potential performance, since the flow of fetched instructions is halted. This paper proposes and evaluates solutions to deal with...
متن کاملA latency-conscious SMT branch prediction architecture
Executing multiple threads has proved to be an effective solution to partially hide latencies that appear in a processor. When a thread is stalled because a long-latency operation is being processed, like a memory access or a floatingpoint calculation, the processor can switch to another context so that another thread can take advantage of the idle resources. However, fetch stall conditions cau...
متن کاملEvaluation of dynamic branch predictors for modern ILP processors
Modern instruction-level parallel (ILP) processors use superscalar architectures with deep pipelines in order to execute multiple instructions per cycle. The frequency and behavior of branch instructions seriously hinder performance of ILP processors. Various mechanisms, both at the compiler, as well as the processor level, have been proposed to predict the branch behavior. This work investigat...
متن کامل